Distributionally Robust Partially Observable Markov Decision Process with Moment-Based Ambiguity
Abstract
We consider a distributionally robust partially observable Markov decision process (DR-POMDP), where the distribution of the transition-observation probabilities is unknown at the beginning of each decision period, but their realizations can be inferred using side information at the end of each period after an action is taken. We build an ambiguity set of the joint distribution using bounded moments via conic constraints and seek an optimal policy to maximize the worst-case (minimum) reward for any distribution in the set. We show that the value function of DR-POMDP is piecewise linear convex with respect to the belief state and propose a heuristic search value iteration method for obtaining lower and upper bounds of the value function. We conduct numerical studies and demonstrate the computational performance of our approach via testing instances of a dynamic epidemic control problem. Our results show that DR-POMDP can produce more robust policies under misspecified distributions of transition-observation probabilities as compared to POMDP, but has less costly solutions than robust POMDP. The DR-POMDP policies are also insensitive to varying the parameter in the ambiguity set and to noise added to the true transition-observation probability values obtained at the end of each decision period.
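The piecewise linear convex property mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard POMDP representation in which the value at a belief b is the maximum of finitely many linear functions of b (often called alpha-vectors). The names `value`, `mix`, and `ALPHAS`, and all numbers, are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from typing import List, Sequence

# Hypothetical alpha-vectors for a two-state problem (illustrative numbers only,
# not results from the paper).
ALPHAS: List[Sequence[float]] = [(1.0, 0.0), (0.0, 1.0), (0.6, 0.6)]


def value(belief: Sequence[float], alpha_vectors: List[Sequence[float]]) -> float:
    """Evaluate V(b) = max over alpha of <alpha, b>.

    A pointwise maximum of linear functions is piecewise linear and convex.
    """
    return max(
        sum(a * b for a, b in zip(alpha, belief)) for alpha in alpha_vectors
    )


def mix(b1: Sequence[float], b2: Sequence[float], lam: float) -> List[float]:
    """Return the convex combination lam * b1 + (1 - lam) * b2 of two beliefs."""
    return [lam * x + (1.0 - lam) * y for x, y in zip(b1, b2)]


if __name__ == "__main__":
    b1, b2 = (0.9, 0.1), (0.2, 0.8)
    midpoint = mix(b1, b2, 0.5)
    # Convexity: the value at the midpoint does not exceed the average
    # of the values at the two endpoints.
    lhs = value(midpoint, ALPHAS)
    rhs = 0.5 * value(b1, ALPHAS) + 0.5 * value(b2, ALPHAS)
    print(lhs <= rhs)
```

Because the value is a maximum over a finite set of vectors, lower bounds of this form can be tightened by adding vectors to the set, which is the general idea behind point-based and heuristic search value iteration methods.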
Similar resources
Robust partially observable Markov decision process
We seek to find the robust policy that maximizes the expected cumulative reward for the worst case when a partially observable Markov decision process (POMDP) has uncertain parameters whose values are only known to be in a given region. We prove that the robust value function, which represents the expected cumulative reward that can be obtained with the robust policy, is convex with respect to ...
The Infinite Partially Observable Markov Decision Process
The Partially Observable Markov Decision Process (POMDP) framework has proven useful in planning domains where agents must balance actions that provide knowledge and actions that provide reward. Unfortunately, most POMDPs are complex structures with a large number of parameters. In many real-world problems, both the structure and the parameters are difficult to specify from domain knowledge alo...
Text Understanding With Partially Observable Markov Decision Process
The process of understanding the meaning of a written passage inherently involves dynamic manipulation and composition of ideas. Starting from this observation this thesis proposes an artificial system for text understanding in which the semantic space containing the possible meanings of the analyzed text is selectively explored by a partially observable Markov decision process trained to effec...
Distributionally Robust Markov Decision Processes
We consider Markov decision processes where the values of the parameters are uncertain. This uncertainty is described by a sequence of nested sets (that is, each set contains the previous one), each of which corresponds to a probabilistic guarantee for a different confidence level so that a set of admissible probability distributions of the unknown parameters is specified. This formulation mode...
Journal
Journal title: SIAM Journal on Optimization
Year: 2021
ISSN: 1095-7189, 1052-6234
DOI: https://doi.org/10.1137/19m1268410